Wen Congyang [Wed, 15 Jul 2015 07:45:47 +0000 (15:45 +0800)]
tools/libxl: rename remus device to checkpoint device
This patch is auto generated by the following commands:
1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h
The patch also fixes the following backward compatibility issue:
The error code ERROR_REMUS_XXX was introduced in Xen 4.5, and
changed to ERROR_CHECKPOINT_XXX after previous renaming.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Reviewed-Lightly-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Wed, 15 Jul 2015 07:45:44 +0000 (15:45 +0800)]
tools/libxl: export logdirty_init
We need to enable logdirty on secondary, so we export logdirty_init
for internal use. Rename it to libxl__logdirty_init.
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Mon, 14 Dec 2015 07:01:44 +0000 (15:01 +0800)]
migration/save: pass checkpointed_stream from libxl to libxc
Pass checkpointed_stream from libxl to libxc.
It won't affect legacy migration because legacy migration
won't use this param.
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Mon, 14 Dec 2015 06:14:28 +0000 (14:14 +0800)]
tools/libxl: introduce enum type libxl_checkpointed_stream
Introduce enum type libxl_checkpointed_stream in IDL.
Rename the last argument of migrate_receive from "remus" to
"checkpointed" since the semantics of this parameter have
changed.
NOTE:
libxl_domain_restore_params and domain_create aren't changed here;
checkpointed_stream is still an int, because we will pass the
value from libxl to libxc.
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Wed, 15 Jul 2015 07:45:36 +0000 (15:45 +0800)]
libxl/save: Refactor libxl__domain_suspend_state
Currently struct libxl__domain_suspend_state contains two types of state:
one is save state, the other is suspend state. This patch separates the
two.
The motivation is that COLO will need to suspend/resume continuously,
so we need a more common suspend state.
After this change, dss stands for libxl__domain_save_state,
dsps stands for libxl__domain_suspend_state.
Also introduce libxl__domain_suspend_init to initialise the
libxl__domain_suspend_state.
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Wed, 15 Jul 2015 07:45:35 +0000 (15:45 +0800)]
tools/libxl: move save/restore code into libxl_dom_save.c
This is purely code motion.
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Wed, 15 Jul 2015 07:45:34 +0000 (15:45 +0800)]
tools/libxl: move remus code into libxl_remus.c
After previous refactoring, we are now able to move all remus code
into a separate file libxl_remus.c.
Export following functions for internal use:
- setup/teardown Remus:
* libxl__remus_setup
* libxl__remus_teardown
* libxl__remus_restore_setup
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Tue, 16 Feb 2016 03:41:16 +0000 (11:41 +0800)]
libxl/remus: init checkpoint callback in Remus setup callback
Init stream {read/write} state checkpoint_callback, suspend/resume/checkpoint
callback in Remus setup callback.
There's no functional change, it's just refactoring so that we can move
all remus code into one file.
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Shannon Zhao [Fri, 26 Feb 2016 11:37:50 +0000 (12:37 +0100)]
arm/acpi: Initialize serial port from ACPI SPCR table
Parse the ACPI SPCR (Serial Port Console Redirection) table and
initialize the pl011 serial port.
Signed-off-by: Parth Dixit <parth.dixit@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Fix build.
Acked-by: Jan Beulich <jbeulich@suse.com>
Bob Moore [Fri, 26 Feb 2016 11:37:18 +0000 (12:37 +0100)]
ACPICA / Headers: Add support for CSRT and DBG2 ACPI tables
These tables are defined outside of the ACPI specification.
Signed-off-by: Bob Moore <robert.moore@intel.com>
[Linux commit 4e2f9c278ad84196991fcf6f6646a3e15967fe90]
[only port the DBG2 changes]
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Hanjun Guo [Fri, 26 Feb 2016 11:36:46 +0000 (12:36 +0100)]
ACPI / table: Print GIC information when MADT is parsed
When MADT is parsed, print GIC information as debug message:
ACPI: GICC (acpi_id[0x0000] address[00000000e112f000] MPIDR[0x0] enabled)
ACPI: GICC (acpi_id[0x0001] address[00000000e112f000] MPIDR[0x1] enabled)
...
ACPI: GICC (acpi_id[0x0201] address[00000000e112f000] MPIDR[0x201] enabled)
This debug information will be very helpful when bringing up early systems,
to see whether acpi_id and MPIDR match as the spec defines.
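A minimal sketch of how such a line could be formatted, assuming a hypothetical trimmed-down struct gicc_entry (not the real ACPICA MADT structure) and using PRIx64 rather than %llx as the porting note below mentions:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a MADT GICC entry; field names are assumptions. */
struct gicc_entry {
    uint32_t acpi_id;      /* ACPI processor UID */
    uint64_t base_address; /* GIC CPU interface base address */
    uint64_t mpidr;        /* MPIDR value of the CPU */
    int enabled;           /* non-zero if the entry is enabled */
};

/* Format one entry in the style of the debug message; returns snprintf()'s result. */
static int format_gicc(char *buf, size_t len, const struct gicc_entry *e)
{
    return snprintf(buf, len,
                    "ACPI: GICC (acpi_id[0x%04" PRIx32 "] address[%016" PRIx64
                    "] MPIDR[0x%" PRIx64 "] %s)",
                    e->acpi_id, e->base_address, e->mpidr,
                    e->enabled ? "enabled" : "disabled");
}
```

The PRIx64 macros keep the format string correct whether uint64_t is unsigned long or unsigned long long on the target.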
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
[Linux commit 4c1c8d7a7ebc8b909493a14b21b233e5377b69aa]
[Use container_of instead of cast and PRIx64 instead of %llx]
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Doug Goldstein [Fri, 26 Feb 2016 11:35:46 +0000 (12:35 +0100)]
build: convert HAS_CORE_PARKING to Kconfig
Convert HAS_CORE_PARKING to Kconfig as CONFIG_CORE_PARKING. While
removing HAS_CORE_PARKING, remove trailing whitespace on a nearby
line.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Doug Goldstein [Fri, 26 Feb 2016 11:33:14 +0000 (12:33 +0100)]
build: convert HAS_NUMA to Kconfig
Convert HAS_NUMA to Kconfig as CONFIG_NUMA and let CONFIG_NUMA be
defined by Kconfig.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Doug Goldstein [Fri, 26 Feb 2016 11:31:47 +0000 (12:31 +0100)]
build: consolidate CONFIG_HAS_ACPI and CONFIG_ACPI
No real advantage to keeping these separate. The use case of this from
Linux is when the platform or target board has support for something but
the user wants to be given the option to disable it.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 26 Feb 2016 11:31:11 +0000 (12:31 +0100)]
printk: introduce separator modifiers for the %ph custom format
The printk formats %*ph{C,D,N} are chosen to be compatible with their Linux
counterparts.
Sample:
(XEN) buf: 00 01 03 07 78 65 6e 00
(XEN) buf: 00:01:03:07:78:65:6e:00
(XEN) buf: 00-01-03-07-78-65-6e-00
(XEN) buf: 0001030778656e00
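As a rough userspace illustration of the four sample lines above (this is not the Xen vsnprintf code; format_hex is a hypothetical helper), the modifiers can be thought of as selecting the separator character: space for plain %*ph, ':' for C, '-' for D, and none for N.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write len bytes of data as two-digit hex into out, separated by sep
 * (no separator when sep is '\0'). Output is truncated to whole bytes
 * if out is too small. Returns the number of characters written,
 * excluding the terminating NUL.
 */
static size_t format_hex(char *out, size_t outlen,
                         const unsigned char *data, size_t len, char sep)
{
    size_t pos = 0;

    if (outlen == 0)
        return 0;

    for (size_t i = 0; i < len; i++) {
        /* Room needed: optional separator, two hex digits, final NUL. */
        size_t need = 2 + ((i && sep) ? 1 : 0);

        if (pos + need + 1 > outlen)
            break;
        if (i && sep)
            out[pos++] = sep;
        pos += (size_t)snprintf(out + pos, outlen - pos, "%02x",
                                (unsigned int)data[i]);
    }
    out[pos] = '\0';
    return pos;
}
```

Calling format_hex() on the eight sample bytes with ' ', ':', '-' and '\0' in turn reproduces the four "buf:" payloads shown in the sample.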
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 26 Feb 2016 11:30:55 +0000 (12:30 +0100)]
docs: update README to include Clang
Xen now builds on x86 with Clang 3.5 and 3.8. Update README to reflect this.
Mark Clang as no longer a permitted failure in Travis, to prevent future
regressions slipping back in.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
David Vrabel [Fri, 26 Feb 2016 11:30:11 +0000 (12:30 +0100)]
x86/hvm: add HVM_PARAM_X87_FIP_WIDTH
Add the HVM parameter HVM_PARAM_X87_FIP_WIDTH to allow tools and the guest
to adjust the width of the FIP/FDP registers to be saved/restored by
the hypervisor. This is in case the hypervisor heuristics do not do
the right thing.
Add this parameter to the set saved during domain save/migrate.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
David Vrabel [Fri, 26 Feb 2016 11:16:13 +0000 (12:16 +0100)]
x86/fpu: add a per-domain field to set the width of FIP/FDP
The x86 architecture allows either: a) the 64-bit FIP/FDP registers to
be restored (clearing FCS and FDS); or b) the 32-bit FIP/FDP and
FCS/FDS registers to be restored (clearing the upper 32-bits).
Add a per-domain field to indicate which of these options a guest
needs. The options are: 8, 4 or 0. Where 0 indicates that the
hypervisor should automatically guess the FIP width by checking the
value of FIP/FDP when saving the state (this is the existing
behaviour).
The FIP width is initially automatic but is set explicitly in the
following cases:
- 32-bit PV guest: 4
- Newer CPUs that do not save FCS/FDS: 8
The x87_fip_width field is placed into an existing 1 byte hole in
struct arch_domain.
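The selection between the three options can be sketched as follows. This is an illustration only: fip_width() is a hypothetical helper, and the automatic case assumes the guess is made by checking whether the upper 32 bits of FIP/FDP are in use, which may not match the hypervisor's exact heuristic.

```c
#include <stdint.h>

/*
 * Pick the FIP/FDP save width in bytes.
 *   configured: 8, 4, or 0 (automatic), as described above.
 * In automatic mode (assumed heuristic), use the 64-bit form only when
 * FIP or FDP actually has bits set above bit 31; otherwise use the
 * 32-bit form, which preserves FCS/FDS.
 */
static unsigned int fip_width(unsigned int configured,
                              uint64_t fip, uint64_t fdp)
{
    if (configured == 4 || configured == 8)
        return configured;

    return ((fip | fdp) >> 32) ? 8 : 4;
}
```

An explicit setting always wins, so a 32-bit PV guest configured with 4 never has its FCS/FDS clobbered by a wrong guess.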
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Fix build.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 26 Feb 2016 11:15:36 +0000 (12:15 +0100)]
vVMX: use latched VMCS machine address
Instead of calling domain_page_map_to_mfn() over and over, latch the
guest VMCS machine address unconditionally (i.e. independent of whether
VMCS shadowing is supported by the hardware).
Since this requires altering the parameters of __[gs]et_vmcs{,_real}()
(and hence all their callers) anyway, take the opportunity to also drop
the bogus double underscores from their names (and from
__[gs]et_vmcs_virtual() as well).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Liang Z Li <liang.z.li@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Fri, 26 Feb 2016 11:15:09 +0000 (12:15 +0100)]
x86emul: simplify IRET logic
Since we only handle real mode, we need to consider neither non-ring0
nor IOPL. Also for POPF the mode_iopl() check can really be inside the
not-ring-0 body.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 26 Feb 2016 11:14:39 +0000 (12:14 +0100)]
x86emul: limit-check branch targets
All branches need to #GP when their target violates the segment limit
(in 16- and 32-bit modes) or is non-canonical (in 64-bit mode). For
near branches facilitate this via a zero-byte instruction fetch from
the target address (resulting in address translation and validation
without an actual read from memory), while far branches get dealt with
by breaking up the segment register loading into a read-and-validate
part and a write one. The latter at once allows correcting some
ordering issues in how the individual emulation steps get carried out:
Before updating machine state, all exceptions unrelated to that state
updating should have got raised (i.e. the only ones possibly resulting
in partly updated state are faulting memory writes [pushes]).
Note that while not immediately needed here, write and distinct read
emulation routines get updated to deal with zero byte accesses too, for
overall consistency.
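The check itself can be sketched roughly as below. This is not the emulator code: the helper names are hypothetical, 48-bit virtual addresses are assumed for the canonical test, and the limit check assumes a plain expand-up code segment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Canonical with 48-bit virtual addresses: bits 63:47 are all 0 or all 1. */
static bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;

    return top == 0 || top == 0x1ffff;
}

/*
 * Would a branch to target be acceptable?  In 64-bit mode the target
 * must be canonical; in 16- and 32-bit modes it must lie within the
 * code segment limit.  A false result corresponds to raising #GP.
 */
static bool branch_target_ok(uint64_t target, bool long_mode,
                             uint32_t cs_limit)
{
    if (long_mode)
        return is_canonical(target);

    return target <= cs_limit;
}
```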
Reported-by: 刘令 <liuling-it@360.cn>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Doug Goldstein [Thu, 25 Feb 2016 12:08:12 +0000 (13:08 +0100)]
x86: CONFIG_COMPAT is defined by Kconfig
Remove duplicate definition.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 25 Feb 2016 12:07:43 +0000 (13:07 +0100)]
x86: unilaterally remove .init mappings
Because of the new 2M alignment of .init and .bss, the existing memory
guarding infrastructure causes a shattered 2M superpage with non-present
entries for .init, and present entries for the alignment space.
Do away with the difference in behaviour between debug and non-debug builds;
always destroy the .init mappings, and reuse the space for xenheap.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Andrew Cooper [Thu, 25 Feb 2016 12:07:14 +0000 (13:07 +0100)]
x86: use 2M superpages for text/data/bss mappings
This balloons the size of Xen in memory from 4.4MB to 8MB, because of the
required alignment adjustments.
However
* All mappings are 2M superpages.
* .text (and .init at boot) are the only sections marked executable.
* .text and .rodata are marked read-only.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 25 Feb 2016 12:06:44 +0000 (13:06 +0100)]
x86: reorder .data and .init when linking
In preparation for using superpage mappings, .data and .bss will both want to
be mapped as read-write. By making them adjacent, they can share the same
superpage and will not require superpage alignment between themselves.
While making this change, fix a latent alignment bug whereby the alignment for
.bss.stack_aligned was in .init. __init_end only needs page alignment (due to
being reclaimed after boot), while .bss.stack_aligned really does need
STACK_SIZE alignment.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 25 Feb 2016 12:06:16 +0000 (13:06 +0100)]
x86: disable CR0.WP while applying alternatives
In preparation for marking .text as read-only, care needs to be taken not to
fault while applying alternatives.
Swapping back to RW mappings is a possibility, but would require additional
TLB management. A temporary disabling of CR0.WP is cleaner.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 25 Feb 2016 12:05:33 +0000 (13:05 +0100)]
memguard: drop memguard_init() entirely
The use of MAP_SMALL_PAGES causes shattering of the superpages making up the
Xen virtual region, and is counter to the purpose of this series.
Furthermore, it is not required for the memguard infrastructure to function
(which itself uses map_pages_to_xen() for creating holes).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@citrix.com>
Andrew Cooper [Thu, 25 Feb 2016 12:05:09 +0000 (13:05 +0100)]
x86: construct the {l2,l3}_bootmap at compile time
... rather than at runtime.
The bootmaps are discarded in zap_low_mappings(), so the tables themselves can
live in .init.data and be reclaimed after boot.
Hooking the l1_identmap into l2_xenmap stays for safety, along with a longer
comment explaining why.
This does not change the EFI construction of {l2,l3}_bootmap. EFI already
constructs them cleanly in their relocated form.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 25 Feb 2016 12:04:44 +0000 (13:04 +0100)]
x86: improvements to build-time pagetable generation
* Additional comments, including size and runtime use.
* Consistent use of .quad, rather than a mix including .long.
No change in runtime behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 25 Feb 2016 12:03:43 +0000 (13:03 +0100)]
lockprof: move .lockprofile.data into .rodata
The entire contents of .lockprofile.data are unchanging pointers to
lock_profile structures in .data. Annotate the type as such, and link the
section in .rodata. As these are just pointers, 32-byte alignment is
unnecessary.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Thu, 25 Feb 2016 12:03:04 +0000 (13:03 +0100)]
public: typo: use ' as apostrophe in grant_table.h
If grep 2.23 is installed, build fails like this:
...
mkdir -p compat
grep -v 'DEFINE_XEN_GUEST_HANDLE(long)' public/grant_table.h | \
python /home/SOURCES/xen/xen/xen.git/xen/tools/compat-build-source.py >compat/grant_table.c.new
mv -f compat/grant_table.c.new compat/grant_table.c
gcc ... -o compat/grant_table.i compat/grant_table.c
compat/grant_table.c:33:1: error: unterminated comment
/*
^
compat/grant_table.c:28:0: error: unterminated #ifndef
#ifndef __XEN_PUBLIC_GRANT_TABLE_H__
^
Makefile:62: recipe for target 'compat/grant_table.i' failed
make[3]: *** [compat/grant_table.i] Error 1
rm compat/grant_table.c
make[3]: Leaving directory '/home/SOURCES/xen/xen/xen.git/xen/include'
...
This is because grant_table.h contains this (note the
apostrophe): "granter\92s memory", and `grep -v', in version
2.23, stops processing the file (while, for instance,
until 2.22, this was not happening).
Although the above behavior is likely an issue in grep,
(https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22461)
I think we better switch to using " ' " in that line
anyway, as we do basically everywhere else (even in
the same file).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Andrew Cooper [Thu, 25 Feb 2016 12:02:29 +0000 (13:02 +0100)]
x86/hvm: print register state upon triple fault
A sample looks like:
(XEN) d1v0 Triple fault - invoking HVM shutdown action 1
(XEN) *** Dumping Dom1 vcpu#0 state: ***
(XEN) ----[ Xen-4.7-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 2
(XEN) RIP: 0000:[<0000000000100005>]
(XEN) RFLAGS: 0000000000010002 CONTEXT: hvm guest (d1v0)
(XEN) rax: 0000000000000020 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: 0000000000000000 rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000000011 cr4: 0000000000000000
(XEN) cr3: 0000000000000000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0000
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 25 Feb 2016 12:01:01 +0000 (13:01 +0100)]
work around Clang generating .data.rel.ro section for init-only files
Clang-3.8 generates several .data.rel.ro sections when compiling Xen. As
these contain no global symbols, they should be .data.rel.ro.local. This
breaks the SPECIAL_DATA_SECTIONS check when converting the translation units to
being init-only.
For alternatives.c, explicitly move the nops arrays into __initconst. For efi
boot.c, manually create the optimisation performed by Clang by collapsing the
switch statement into a lookup table. The double use of const is required to
avoid breaking the ARM build by creating a section type conflict with
fdt_guid.
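To illustrate the switch-to-lookup-table transformation and the "double use of const", here is a generic sketch. It is not the efi boot.c code; the enum and names are invented for the example.

```c
#include <stddef.h>

/* Hypothetical enum, for illustration only. */
enum color { RED, GREEN, BLUE, COLOR_MAX };

/* Original shape: a switch that a compiler may turn into a table itself. */
static const char *color_name_switch(enum color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    default:    return NULL;
    }
}

/*
 * Hand-written table.  The first const makes the pointed-to characters
 * read-only; the second makes the array of pointers itself read-only,
 * so the whole object is constant data.
 */
static const char *const color_names[COLOR_MAX] = {
    [RED]   = "red",
    [GREEN] = "green",
    [BLUE]  = "blue",
};

static const char *color_name_table(enum color c)
{
    return (unsigned int)c < COLOR_MAX ? color_names[c] : NULL;
}
```

Writing the table by hand means the author, not the compiler, decides which section attribute the table gets.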
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Haozhong Zhang [Thu, 25 Feb 2016 12:00:11 +0000 (13:00 +0100)]
x86/hvm: collect information of TSC scaling ratio
Both VMX TSC scaling and SVM TSC ratio use the 64-bit TSC scaling ratio,
but the number of fractional bits of the ratio is different between VMX
and SVM. This patch adds the architecture code to collect the number of
fractional bits and other related information into fields of struct
hvm_function_table so that they can be used in the common code.
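A sketch of why common code needs the number of fractional bits: scaling is a fixed-point multiply followed by a right shift. The helper below is hypothetical, relies on the unsigned __int128 extension of GCC and Clang, and the example values of 48 fractional bits (VMX) and 32 (SVM) reflect my understanding of the hardware formats rather than anything stated in this commit.

```c
#include <stdint.h>

/*
 * Scale a TSC value by a fixed-point ratio that has frac_bits
 * fractional bits.  The intermediate product needs 128 bits.
 */
static uint64_t scale_tsc(uint64_t tsc, uint64_t ratio,
                          unsigned int frac_bits)
{
    unsigned __int128 product = (unsigned __int128)tsc * ratio;

    return (uint64_t)(product >> frac_bits);
}
```

The same ratio of 1.5 is encoded as 3 << 31 with 32 fractional bits and as 3 << 47 with 48, so the caller must know which format it is holding.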
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Konrad Rzeszutek Wilk [Fri, 19 Feb 2016 14:26:02 +0000 (09:26 -0500)]
version: Document guest_handle
And what it is usually used for.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Doug Goldstein [Wed, 24 Feb 2016 11:06:28 +0000 (12:06 +0100)]
xenoprof: drop unnecessary macro
This macro doesn't really provide a benefit. When support is added the
implementer can implement this how it needs to be and not conform to the
macro. Additionally this change limits the output of the warning to just
once instead of nrpages worth. While editing this area I dropped
trailing whitespace.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Wed, 24 Feb 2016 11:05:58 +0000 (12:05 +0100)]
use XEN_SYSCTL_SCHEDOP_* for sysctl operation checks
In flask_sysctl_scheduler_op() and sched_adjust_global() the test for
the desired operation is done with the wrong constants. While the
values are correct, the names are not.
Correct the error message for the case of an unknown operation in
flask_sysctl_scheduler_op(), too.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Huaitong Han [Wed, 24 Feb 2016 11:05:20 +0000 (12:05 +0100)]
x86/hvm: add pkeys support for cpuid handling
This patch adds pkeys support for cpuid handling.
Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. Software support is
CPUID.7.0.ECX[4]:OSPKE, which reflects the setting of CR4.PKE.
X86_FEATURE_OSXSAVE depends on the guest's X86_FEATURE_XSAVE, but
cpu_has_xsave reflects the hypervisor's X86_FEATURE_XSAVE; this is fixed too.
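The relationship between the two bits can be sketched like this. It is an illustration only, with a hypothetical helper name; CR4.PKE is taken to be bit 22 of CR4.

```c
#include <stdint.h>

#define CPUID7_ECX_PKU   (UINT32_C(1) << 3)   /* CPUID.7.0.ECX[3] */
#define CPUID7_ECX_OSPKE (UINT32_C(1) << 4)   /* CPUID.7.0.ECX[4] */
#define CR4_PKE          (UINT64_C(1) << 22)  /* CR4.PKE */

/*
 * Compute the ECX value a guest would see for CPUID leaf 7, subleaf 0.
 * PKU is shown only if the guest is allowed the feature; OSPKE is never
 * taken from the input value but mirrors the guest's own CR4.PKE.
 */
static uint32_t guest_cpuid7_ecx(uint32_t ecx, int guest_has_pku,
                                 uint64_t guest_cr4)
{
    ecx &= ~CPUID7_ECX_OSPKE;

    if (!guest_has_pku)
        ecx &= ~CPUID7_ECX_PKU;

    if ((ecx & CPUID7_ECX_PKU) && (guest_cr4 & CR4_PKE))
        ecx |= CPUID7_ECX_OSPKE;

    return ecx;
}
```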
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Huaitong Han [Wed, 24 Feb 2016 11:04:50 +0000 (12:04 +0100)]
x86/hvm: add xstate support for pkeys
The XSAVE feature set can operate on PKRU state only if the feature set is
enabled (CR4.OSXSAVE = 1) and has been configured to manage PKRU state
(XCR0[9] = 1). And XCR0.PKRU is disabled on PV mode without PKU feature
enabled.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Huaitong Han [Wed, 24 Feb 2016 11:04:19 +0000 (12:04 +0100)]
x86/hvm: add pkeys support for guest_walk_tables
Protection keys define a new 4-bit protection key field (PKEY) in bits 62:59 of
leaf entries of the page tables.
The PKRU register defines 32 bits: there are 16 domains and 2 attribute bits per
domain in PKRU. For each i (0 <= i <= 15), PKRU[2i] is the access-disable bit for
protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
i (WDi). PKEY is an index to a defined domain.
A fault is considered as a PKU violation if all of the following conditions are
true:
1. CR4_PKE=1.
2. EFER_LMA=1.
3. Page is present with no reserved bit violations.
4. The access is not an instruction fetch.
5. The access is to a user page.
6. PKRU.AD=1,
   or the access is a data write and PKRU.WD=1
   and either CR0.WP=1 or it is a user access.
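The six conditions translate fairly directly into code. The following is a sketch only: struct access and the function names are hypothetical, and the real guest_walk_tables logic obtains these inputs differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 4-bit protection key from bits 62:59 of a leaf entry. */
static unsigned int pte_pkey(uint64_t pte)
{
    return (unsigned int)((pte >> 59) & 0xf);
}

/* Hypothetical summary of the state relevant to one memory access. */
struct access {
    bool cr4_pke;         /* 1. CR4.PKE */
    bool efer_lma;        /* 2. EFER.LMA */
    bool present_no_rsvd; /* 3. present, no reserved bit violations */
    bool insn_fetch;      /* 4. access is an instruction fetch */
    bool user_page;       /* 5. access is to a user page */
    bool write;           /* 6. access is a data write */
    bool cr0_wp;          /* 6. CR0.WP */
    bool user_access;     /* 6. access made in user mode */
};

/* Returns true if the access counts as a PKU violation. */
static bool pkey_fault(uint32_t pkru, uint64_t pte, const struct access *a)
{
    unsigned int key = pte_pkey(pte);
    bool ad = (pkru >> (2 * key)) & 1;      /* PKRU[2i]   */
    bool wd = (pkru >> (2 * key + 1)) & 1;  /* PKRU[2i+1] */

    if (!a->cr4_pke || !a->efer_lma || !a->present_no_rsvd ||
        a->insn_fetch || !a->user_page)
        return false;

    return ad || (a->write && wd && (a->cr0_wp || a->user_access));
}
```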
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 24 Feb 2016 11:03:32 +0000 (12:03 +0100)]
credit1: trace vCPU boost/unboost
Add tracepoints and a performance counter for
boosting and unboosting in Credit1.
Note that they (the trace points) do not cover
the case of the idle vCPU being boosted to run
a tasklet, as there already is
TRC_CSCHED_SCHED_TASKLET for that.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Wed, 24 Feb 2016 11:02:37 +0000 (12:02 +0100)]
sched: get rid of static private schedulers' structures
In fact, they look rather useless: they are never
referenced, either directly or via the sched_data
pointer, as a dynamic copy that overrides them is
allocated as the very first step of a scheduler's
initialization.
While there, take the chance to also reset the sched_data
pointer to NULL, upon scheduler de-initialization.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Mon, 22 Feb 2016 16:42:21 +0000 (17:42 +0100)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Ian Jackson [Mon, 22 Feb 2016 16:40:12 +0000 (16:40 +0000)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Jan Beulich [Mon, 22 Feb 2016 16:38:34 +0000 (17:38 +0100)]
common: re-arrange struct kernel_param fields
Even if placed in .init.* there's no reason to needlessly bloat the
binary due to padding fields the compiler needs to insert on 64-bit
architectures.
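The effect of field order on padding can be seen with two hypothetical layouts (these are not the real struct kernel_param fields):

```c
#include <stddef.h>

/* Small fields interleaved with pointer-sized ones force padding. */
struct param_padded {
    char type;         /* followed by padding up to pointer alignment */
    void *var;
    char flag;         /* followed by padding up to int alignment */
    unsigned int len;
};

/* Same fields, largest alignment first: only tail padding remains. */
struct param_packed {
    void *var;
    unsigned int len;
    char type;
    char flag;
};
```

On a typical LP64 ABI the first layout should occupy 24 bytes and the second 16, which adds up across every command line parameter in the table.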
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Mon, 22 Feb 2016 16:30:54 +0000 (17:30 +0100)]
Revert "init: annotate all command line parameter infrastructure as const"
This reverts commit 59b151d2c0bf37f3f2f984096d384dfdfa03a8f4,
as it breaks the build with older gcc.
Tamas K Lengyel [Mon, 22 Feb 2016 16:24:15 +0000 (17:24 +0100)]
x86/vm_event: consolidate hvm_event_fill_regs and p2m_vm_event_fill_regs
Currently the registers saved in the request depend on which type of event
is filling in the registers. In this patch we consolidate the two versions
of the register filling function so as to return a fixed set of registers
irrespective of the underlying event.
Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Jan Beulich [Mon, 22 Feb 2016 16:23:08 +0000 (17:23 +0100)]
x86: drop register reload from INT80 malicious MSI guard
None of the restored registers are actually of interest to the
subsequent code (as opposed to the similar construct on the compat
mode hypercall path).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Mon, 22 Feb 2016 16:21:58 +0000 (17:21 +0100)]
arm: CONFIG_ARM_{32, 64} defined by Kconfig
CONFIG_ARM_32 and CONFIG_ARM_64 are defined by Kconfig.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Doug Goldstein [Mon, 22 Feb 2016 16:21:03 +0000 (17:21 +0100)]
x86: CONFIG_X86 defined by Kconfig
CONFIG_X86 is defined by Kconfig when building for x86.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Mon, 22 Feb 2016 16:20:37 +0000 (17:20 +0100)]
x86/PVHv2: add XEN_ prefix to HVM_START_MAGIC_VALUE
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Mon, 22 Feb 2016 16:19:52 +0000 (17:19 +0100)]
introduce IS_ALIGNED()
And use it in place of a few open-coded alignment checks which I encountered.
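For reference, a common way to write such a macro is shown below. This is a sketch and the definition in the patch may differ; it assumes the alignment is a power of two.

```c
#include <stdint.h>

/* True if val is a multiple of align; align must be a power of two. */
#define IS_ALIGNED(val, align) (((val) & ((align) - 1)) == 0)

/* Example use: is an address aligned to a 2M superpage boundary? */
static int is_2m_aligned(uint64_t addr)
{
    return IS_ALIGNED(addr, UINT64_C(1) << 21);
}
```

An open-coded check such as !(addr & (size - 1)) says the same thing, but the named macro makes the intent obvious at the call site.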
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Mon, 22 Feb 2016 16:18:59 +0000 (17:18 +0100)]
sched: tracing: enable TSC tracing for all events
It is enabled for pretty much all of them already.
There were just a few that had it disabled.
When tracing a scheduler, timing information is
really important, so enable it everywhere scheduling
related.
Note that this was not really a problem if looking
at the traces with xenalyze, but it was if using
xentrace_format.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Mon, 22 Feb 2016 16:17:18 +0000 (17:17 +0100)]
init: annotate all command line parameter infrastructure as const
There is no reason for any of it to be modified. Additionally, link
.init.setup beside the other constant .init data.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
[jb: reduce alignments to 8]
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Doug Goldstein [Fri, 19 Feb 2016 02:57:03 +0000 (20:57 -0600)]
m4/python: fix typo in LDFLAGS variable name
[ also, reran autogen.sh ]
Reported-by: Jonathan Creekmore <jonathan.creekmore@gmail.com>
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Doug Goldstein [Fri, 19 Feb 2016 19:55:49 +0000 (13:55 -0600)]
MAINTAINERS: add Doug Goldstein for Travis CI config
Add myself as the maintainer for the Travis CI config.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 19 Feb 2016 18:31:30 +0000 (18:31 +0000)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Andrew Cooper [Fri, 12 Feb 2016 19:06:48 +0000 (19:06 +0000)]
tools/xenalyze: Fix build with clang
1) EXIT_REASON_EXCEPTION_NMI is 0, and Clang complains:
xenalyze.c:513:33: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
[EXIT_REASON_EXCEPTION_NMI]="EXCEPTION_NMI",
^~~~~~~~~~~~~~~
xenalyze.c:512:11: note: previous initialization is here
[0] = "NONE",
^~~~~~
2) cr3_time_compare(), eip_compare(), ipi_send() and cr3_compare_start() are
declared as nested functions, which is a GCCism not supported by Clang.
As they don't actually make use of the interesting feature offered by
nested functions (i.e. dynamic scoping), move them to just being normal
functions.
3) clear_interval_summary(), update_cpi() and clear_interval_cpi() are all
unused. The first isn't referenced anywhere, so is deleted, while the other
two are called from currently #if 0'd code.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
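The first Clang complaint in the xenalyze fix can be reproduced in isolation. What follows is a minimal sketch with hypothetical names (DEMO_EXIT_*, demo_exit_names, demo_exit_name), not the actual xenalyze table, and it shows only one possible way of avoiding the duplicate initialiser:

```c
#include <stddef.h>

/* Hypothetical numbering; as in the message above, the first reason is 0. */
enum {
    DEMO_EXIT_EXCEPTION_NMI = 0,
    DEMO_EXIT_EXTERNAL_INTERRUPT = 1,
    DEMO_EXIT_MAX
};

/*
 * Writing both [0] = "NONE" and [DEMO_EXIT_EXCEPTION_NMI] = "EXCEPTION_NMI"
 * would initialise slot 0 twice, which Clang reports under
 * -Winitializer-overrides. Naming each slot exactly once avoids that.
 */
static const char *const demo_exit_names[DEMO_EXIT_MAX] = {
    [DEMO_EXIT_EXCEPTION_NMI]      = "EXCEPTION_NMI",
    [DEMO_EXIT_EXTERNAL_INTERRUPT] = "EXTERNAL_INTERRUPT",
};

/* Look up a name, falling back to a placeholder for unknown reasons. */
static const char *demo_exit_name(unsigned int reason)
{
    if (reason >= DEMO_EXIT_MAX || demo_exit_names[reason] == NULL)
        return "(unknown)";
    return demo_exit_names[reason];
}
```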
Ian Campbell [Wed, 17 Feb 2016 15:39:55 +0000 (15:39 +0000)]
xenpaging: don't try to log via xch if xc_interface_close fails
Since xch may not be valid (enough) any longer, xc_interface_close
already logs anything of any use before it tears down the integrated
logger so there is no need to log any further in the application via
that path.
CID: 1056203
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 17 Feb 2016 14:30:38 +0000 (14:30 +0000)]
tools: gtracestat: make all functions and global data static
After "Drop unused functions do_cstate and single_cstate helper" make
all the remaining functions and global data static and in the process
allow the compiler to notice that cond_rec_init() is also unused, thus
remove it.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 17 Feb 2016 14:30:37 +0000 (14:30 +0000)]
tools: gtracestat: Drop unused functions do_cstate and single_cstate helper
These have always been dead code since the code was added AFAICT.
This eliminates the code containing CID 10567079, 10567080, 10567081
and 10567082 (all apparently some confusion between max_cx_num vs
MAX_CX_NR, but given the lack of callers it's hard to tell what was
intended)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Doug Goldstein [Fri, 19 Feb 2016 04:25:57 +0000 (22:25 -0600)]
build: convert xenoprof to Kconfig
Convert the xenoprof x86 build time option to Kconfig.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Doug Goldstein [Fri, 19 Feb 2016 04:25:56 +0000 (22:25 -0600)]
xenoprof: fix up ability to disable it
Allow Xenoprof to be fully disabled when toggling the option off.
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Ian Campbell [Wed, 17 Feb 2016 14:04:15 +0000 (14:04 +0000)]
xl: create: close restore_fd_to_close on error
Currently the fd is opened and then later closed and
restore_fd_to_close set back to -1, however there are several goto out
and goto error_out paths in the interim.
Since the code resets restore_fd_to_close to -1 it is OK to check this
and close on the out path too.
CID: 1055897
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 17 Feb 2016 14:04:14 +0000 (14:04 +0000)]
xl: use xrealloc in domain create
Using bare realloc risks leaking the old pointer if the realloc fails.
Since xrealloc exits on such failures, drop the error handling.
Noticed while fixing, but not related to, CID 1055898.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
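The leak risk with bare realloc, and why an exit-on-failure wrapper removes the need for error handling, can be sketched as follows. demo_xrealloc and demo_append are hypothetical stand-ins; the real xrealloc in xl may differ in its message and exit status:

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative exit-on-failure wrapper around realloc. */
static void *demo_xrealloc(void *ptr, size_t sz)
{
    void *r = realloc(ptr, sz);

    if (r == NULL && sz != 0) {
        fprintf(stderr, "demo_xrealloc: failed to allocate %zu bytes\n", sz);
        exit(EXIT_FAILURE);
    }
    return r;
}

/* Append one value to a growing array and return the new element count. */
static size_t demo_append(int **arr, size_t n, int value)
{
    /*
     * With bare realloc, "*arr = realloc(*arr, ...)" would overwrite the
     * only pointer to the old block when realloc returns NULL. The wrapper
     * never returns NULL for a non-zero size, so no error path is needed.
     */
    *arr = demo_xrealloc(*arr, (n + 1) * sizeof(**arr));
    (*arr)[n] = value;
    return n + 1;
}
```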
Ian Jackson [Thu, 18 Feb 2016 12:37:04 +0000 (12:37 +0000)]
tools: libxl: Simplify logic in libxl__realloc
Replace the loop exit and separate test for loop overrun with an
assert in the loop body.
This simplifies the code. It also (hopefully) avoids Coverity
thinking that gc->alloc_maxsize might change, resulting in the loop
failing to find the right answer but also failing to abort.
(gc->alloc_maxsize can't change because gcs are all singlethreaded:
either they are on the stack of a specific thread, or they belong to
an ao and are covered by the ctx lock.)
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Thu, 18 Feb 2016 16:48:09 +0000 (17:48 +0100)]
travis: drop bridge-utils and iproute2
These packages are not permitted inside travis, and are not necessary for
building Xen.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Corneliu ZUZU [Thu, 18 Feb 2016 16:47:36 +0000 (17:47 +0100)]
x86/monitor: minor left-shift undefined behavior checks
This minor patch adds a range-check to avoid left-shift caused undefined
behavior. Also replaces '1 <<' w/ '1U <<' @ x86 monitor.h in an effort to avoid
a future potential '1 << 31' that would cause a similar issue.
Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
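The two measures named in the x86/monitor patch (a range check before the shift, and an unsigned constant) can be illustrated as below. demo_event_mask is a hypothetical helper, not code from monitor.h, and the sketch assumes a 32-bit unsigned int as on x86:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a single-bit mask for an event index.
 * Shifting by 32 or more is undefined for a 32-bit operand, so the index
 * is range-checked first. Using 1U rather than 1 keeps "<< 31" from
 * shifting into the sign bit of a signed int.
 */
static bool demo_event_mask(unsigned int event, uint32_t *mask)
{
    if (event >= 32)
        return false;   /* out of range: reject instead of shifting */

    *mask = 1U << event;
    return true;
}
```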
Doug Goldstein [Thu, 18 Feb 2016 16:47:15 +0000 (17:47 +0100)]
travis: add randconfig test target
Add another build target which uses randconfig to randomize the config
file so that we build test more than the default config.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Doug Goldstein [Thu, 18 Feb 2016 16:46:40 +0000 (17:46 +0100)]
add randconfig target to Makefile
This allows us to generate a random config which can be used for build
testing random configurations.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Konrad Rzeszutek Wilk [Thu, 18 Feb 2016 16:46:05 +0000 (17:46 +0100)]
mkelf32: Remove the 32-bit hypervisor support
We do not compile 32-bit hypervisor anymore so the code for
the ELFCLASS32 is effectively dead.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 18 Feb 2016 14:10:07 +0000 (15:10 +0100)]
x86: fix unintended fallthrough case from XSA-154
... and annotate the other deliberate one: Coverity objects otherwise.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
One of the two instances was actually a bug.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Corneliu ZUZU [Thu, 18 Feb 2016 14:08:25 +0000 (15:08 +0100)]
x86/hvm_event: fix uninitialized struct field usage introduced by c/s f5365e6
c/s f5365e6: "xen/vm-events: Move parts of monitor_domctl code to common-side",
introduced a use without initialization issue.
hvm_event_breakpoint calls hvm_event_traps(&req) and if sync is true that
ors some bits into req->flags which was never initialised.
Reported by Coverity Scan.
Initializes req @ hvm_event_breakpoint entry.
Coverity-ID: 1353192
Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 18 Feb 2016 14:07:59 +0000 (15:07 +0100)]
avoid left shifting into a sign bit
Clang 3.8 notices, and objects because it is undefined behaviour.
"error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]"
Use unsigned constants rather than signed ones.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Feng Wu <feng.wu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 18 Feb 2016 14:07:33 +0000 (15:07 +0100)]
x86: drop failsafe callback invocation from assembly
Afaict this was never necessary on a 64-bit hypervisor, and was instead
just blindly cloned over from 32-bit code: We don't fiddle with (and
hence don't reload) any of DS, ES, FS, or GS, and an exception on IRET
itself can equally well be reported to the guest as that very exception
on the target of that IRET.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 18 Feb 2016 14:07:11 +0000 (15:07 +0100)]
VMX: fold redundant code
No need to do this in two slightly different ways, possibly keeping the
compiler from folding the code for us.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 18 Feb 2016 14:05:34 +0000 (15:05 +0100)]
x86emul: fix rIP handling
Deal with rIP just like with any other register: Truncate to designated
width upon entry, write back the zero-extended 32-bit value when
emulating 32-bit code, and leave the upper 48 bits unchanged for 16-bit
code.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
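The write-back rule described in the rIP handling fix can be sketched as a small helper. demo_writeback_ip and its width parameter are hypothetical and only model the rule as worded in the message, not the emulator's actual code:

```c
#include <stdint.h>

/*
 * width is the operating width of the emulated code in bits (16, 32 or 64).
 * old_rip is the register value on entry, new_ip the value computed by the
 * emulated instruction.
 */
static uint64_t demo_writeback_ip(uint64_t old_rip, uint64_t new_ip,
                                  unsigned int width)
{
    switch (width) {
    case 16:
        /* Replace the low 16 bits, leave the upper 48 bits unchanged. */
        return (old_rip & ~UINT64_C(0xffff)) | (new_ip & UINT64_C(0xffff));
    case 32:
        /* Write back the zero-extended 32-bit value. */
        return new_ip & UINT64_C(0xffffffff);
    default:
        /* 64-bit code: the full value is written back. */
        return new_ip;
    }
}
```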
Jan Beulich [Thu, 18 Feb 2016 14:05:00 +0000 (15:05 +0100)]
x86/mm: slightly simplify mod_l1_entry()
Re-order code to simplify error cleanup.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Dario Faggioli [Thu, 18 Feb 2016 14:04:23 +0000 (15:04 +0100)]
RTDS: pack trace data better for xentrace_format
When tracing runstate changes, the vcpu and domain IDs
are encoded in the lower and higher, respectively, parts
of a 32-bit integer. When decoding a trace with
xentrace_format, this makes it possible to display
such events like this:
CPU0 833435853624 (+ 768) running_to_runnable [ dom:vcpu = 0x7fff0000 ]
CPU0 833435854416 (+ 792) runnable_to_running [ dom:vcpu = 0x00000007 ]
For consistency, we should do the same when displaying
the events coming from the RTDS scheduler (when using
the same tool), and to do that, we need to invert the
order in which the fields are being put in the trace
struct right now.
While there, we also:
- fix the use of TRC_RTDS_SCHED_TASKLET (it should
only be involved when a tasklet is scheduled, not
_every_ time rt_schedule() is invoked!);
- remove a very chatty and useless (nothing has been
picked!) use of TRC_RTDS_RUNQ_PICK.
In fact, one can already figure out when nothing has been
picked from the runqueue, by looking at when cpu_idle
is invoked --which is the same thing one would do if on
Credit or Credit2.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
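The dom:vcpu packing referred to in the RTDS trace patch can be illustrated with a few helpers. The names are hypothetical, and the even 16/16 split is an assumption inferred from the sample values 0x7fff0000 and 0x00000007 in the message:

```c
#include <stdint.h>

/* vcpu ID in the lower half, domain ID in the upper half. */
static uint32_t demo_pack_domvcpu(uint16_t dom, uint16_t vcpu)
{
    return ((uint32_t)dom << 16) | vcpu;
}

/* Recover the domain ID from a packed value. */
static uint16_t demo_unpack_dom(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}

/* Recover the vcpu ID from a packed value. */
static uint16_t demo_unpack_vcpu(uint32_t packed)
{
    return (uint16_t)(packed & 0xffff);
}
```

Under that assumption, the first sample line decodes to domain 0x7fff, vcpu 0, and the second to domain 0, vcpu 7.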
Dario Faggioli [Thu, 18 Feb 2016 14:04:00 +0000 (15:04 +0100)]
credit2: pack trace data better for xentrace_format
When tracing runstate changes, the vcpu and domain IDs
are encoded in the lower and higher, respectively, parts
of a 32-bit integer. When decoding a trace with
xentrace_format, this makes it possible to display
such events like this:
CPU0 833435853624 (+ 768) running_to_runnable [ dom:vcpu = 0x7fff0000 ]
CPU0 833435854416 (+ 792) runnable_to_running [ dom:vcpu = 0x00000007 ]
For consistency, we should do the same when displaying
the events coming from the Credit2 scheduler (when using
the same tool), and to do that, we need to invert the
order in which the fields are being put in the trace
struct right now.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Thu, 18 Feb 2016 14:03:34 +0000 (15:03 +0100)]
sched: improve domain creation tracing
by doing the following two things:
- move TRC_SCHED_DOM_{ADD,REM}, into the functions
that do the actual scheduling-related domain
initialization;
- add two 'generic' DOM_{ADD,REM} events. They're
made part of the TRC_DOM0 tracing class, as Dom0
is, usually, from where domains are created and
destroyed.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Thu, 18 Feb 2016 14:03:15 +0000 (15:03 +0100)]
sched: move up the trace record for vcpu_wake and vcpu_sleep
vcpu_wake() and vcpu_sleep() are called before the specific
schedulers' wakeup and sleep routines (in fact, it is they
that call those specific routines).
Make the trace reflect that, by moving the records up. In
fact, it is more natural and easy to find the record of
the event (e.g., the wakeup) *before* the records of the
actions that deal with the event itself.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Thu, 18 Feb 2016 14:02:52 +0000 (15:02 +0100)]
credit: __runq_tickle takes a useless cpu parameter
as it always acts on v->processor of the vcpu that
we are tickling.
Getting rid of it makes the code easier to understand
and better looking.
While there, remove a spurious blank line.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Thu, 18 Feb 2016 14:02:16 +0000 (15:02 +0100)]
x86: avoid flush IPI when possible
Since CLFLUSH, other than WBINVD, is a cache coherency domain wide
flush, there's no need to IPI other CPUs if this is the only flushing
being requested. (As a secondary change, move a local variable into the
scope where it's actually needed.)
As a secondary change also eliminate another leftover from 32-bit days:
invalidate_interrupt() can clear FLUSH_TLB_GLOBAL alongside FLUSH_TLB,
since write_ptbase() (as a descendant of __sync_local_execstate()) now
unconditionally fiddles with CR4.PGE.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Harmandeep Kaur [Fri, 12 Feb 2016 11:08:32 +0000 (16:38 +0530)]
libxc: fix leak of t_info in xc_tbuf_get_size()
Avoid leaking the memory mapping of the trace buffer
Coverity ID 1351228
Signed-off-by: Harmandeep Kaur <write.harmandeep@gmail.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Dirk Behme [Thu, 4 Feb 2016 16:49:35 +0000 (17:49 +0100)]
xen/arm64: Make sure we get all debug output
Starting in the wrong ELx mode I get the following debug output:
...
- Current EL 00000004 -
- Xen must be entered in NS EL2 mode -
- Boot failed -
The output of "Please update the bootloader" is missing here, because
string concatenation in gas, unlike in C, keeps the \0 between each
individual string.
Make sure this is output, too. With this, we get
...
- Current EL 00000004 -
- Xen must be entered in NS EL2 mode -
- Please update the bootloader -
- Boot failed -
as intended.
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- added same change to arm32 case ]
Ian Campbell [Wed, 17 Feb 2016 14:58:33 +0000 (14:58 +0000)]
xenpaging: do not leak if --pagefile given twice
By freeing filename (which is either NULL or the previous iteration of
this argument). This implements a semantic where the last --pagefile
given on the command line takes precedence.
This is the same semantic as the other options have.
CID:
1198792
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
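The "last --pagefile wins" semantic in the xenpaging fix amounts to freeing the previous value before storing the new one. A minimal sketch, with demo_set_pagefile as a hypothetical helper rather than the actual xenpaging option parser:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Remember the most recent --pagefile argument.
 * *filename is NULL before the first use; free(NULL) is a no-op, so the
 * same code handles the first and every later occurrence.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
static int demo_set_pagefile(char **filename, const char *arg)
{
    size_t len = strlen(arg) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, arg, len);

    free(*filename);    /* drop the previous value instead of leaking it */
    *filename = copy;
    return 0;
}
```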
Jim Fehlig [Wed, 17 Feb 2016 17:20:58 +0000 (10:20 -0700)]
docs: fix typo in xl-disk-configuration.txt
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jim Fehlig [Wed, 17 Feb 2016 17:20:57 +0000 (10:20 -0700)]
libxlu_cfg: reject unknown characters following '\'
When dequoting config strings in xlu__cfgl_dequote(), unknown
characters following a '\', and the '\' itself, are discarded.
E.g. a disk configuration string containing
rbd:pool/image:mon_host=192.168.0.100\:6789
would be dequoted as
rbd:pool/image:mon_host=192.168.0.1006789
Instead of discarding the '\' and unknown character, reject the
string and set error to EINVAL.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
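The behaviour change in the libxlu_cfg patch (reject instead of silently discard) can be sketched with a simplified dequoter. demo_dequote is hypothetical, and the set of escapes it accepts is an assumption made for illustration; it is not the set handled by xlu__cfgl_dequote():

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy src to dst, translating a small set of backslash escapes.
 * dst must have room for strlen(src) + 1 bytes.
 * Returns 0 on success, or EINVAL if '\' is followed by an unknown
 * character or by the end of the string.
 */
static int demo_dequote(const char *src, char *dst)
{
    while (*src) {
        if (*src != '\\') {
            *dst++ = *src++;
            continue;
        }
        src++;                          /* look at the escaped character */
        switch (*src) {
        case '\\': *dst++ = '\\'; break;
        case '"':  *dst++ = '"';  break;
        case 'n':  *dst++ = '\n'; break;
        case 't':  *dst++ = '\t'; break;
        default:
            return EINVAL;              /* e.g. "\:" or a trailing '\' */
        }
        src++;
    }
    *dst = '\0';
    return 0;
}
```

With this sketch, the mon_host string from the message is rejected outright rather than being turned into a different address.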
Doug Goldstein [Wed, 17 Feb 2016 15:24:29 +0000 (16:24 +0100)]
x86/PMU: make {acquire,release}_pmu_ownership names consistent
The function names were inconsistent with acquire and release being
called acquire_pmu_ownership() and release_pmu_ownship() respectively.
Function prototypes were available for both spellings so this change
makes them consistent and drops the dual function prototypes.
Additionally change the internal variable names within those functions
to ownership as well.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Jan Beulich [Wed, 17 Feb 2016 15:23:31 +0000 (16:23 +0100)]
Revert "x86/HVM: differentiate IO/mem resources tracked by ioreq server"
This reverts commit f5a32c5b8eacbcd727939c9b4d2d98cf619bcbd6;
we're aiming at a different solution now.
Roger Pau Monné [Wed, 17 Feb 2016 15:22:21 +0000 (16:22 +0100)]
x86/PVHv2: update the start info structure layout
After some discussion around the new boot ABI, consensus has been reached
about the layout and contents of the start info. The following patch updates
the layout to what has been agreed.
Also, the new layout is described in binary terms in order to avoid issues
with alignments when using C structs.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Wed, 17 Feb 2016 15:21:48 +0000 (16:21 +0100)]
MAINTAINERS: add myself as seabios maintainer
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Juergen Gross [Wed, 17 Feb 2016 15:21:20 +0000 (16:21 +0100)]
public: make some constants usable for assembler
Some constants defined in xen/include/public/xen.h are not usable in
assembler sources as they are either defined with "U" or "UL" suffixes
or they are inside #ifndef __ASSEMBLY__ areas.
Change this as grub2 could make use of those definitions.
This requires moving the definition of mk_unsigned_long() up. While
we are touching this macro, rename it in order to avoid namespace
pollution. This in turn requires adaptation of some arch-x86 specific
headers.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
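The general technique for sharing such constants between C and assembler sources is to apply the integer suffix only when compiling C. The fragment below is a sketch with hypothetical DEMO_* names; it is not the contents of xen/include/public/xen.h and does not use the macro's new name, which the message does not give:

```c
#ifdef __ASSEMBLY__
/* Assemblers do not understand C integer suffixes: emit the bare number. */
#define DEMO_MK_ULONG(x) x
#else
/* In C, paste a UL suffix so the constant has type unsigned long. */
#define DEMO_MK_ULONG_(x) x ## UL
#define DEMO_MK_ULONG(x)  DEMO_MK_ULONG_(x)
#endif

/* A constant defined this way is usable from both kinds of source file. */
#define DEMO_VIRT_START DEMO_MK_ULONG(0x8000)
```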
Juergen Gross [Wed, 17 Feb 2016 15:20:35 +0000 (16:20 +0100)]
cleanup xen/config.h
config.h contains an unused definition of mk_unsigned_long().
Remove it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Wed, 17 Feb 2016 15:20:01 +0000 (16:20 +0100)]
x86emul: relax asm() constraints
Let's give the compiler as much liberty at picking instruction operands
as possible. Also drop unnecessary size modifiers when the correct size
can already be derived from the asm() operands. Finally also drop an
"unsigned" from idiv_dbl()'s second parameter, allowing a cast to be
eliminated.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 17 Feb 2016 15:19:27 +0000 (16:19 +0100)]
x86emul: fold almost identical code
AAM/AAD as well as DAA/DAS emulation code is respectively almost
identical. Fold each pair, following what's already the case for
AAA/AAS.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 17 Feb 2016 15:18:50 +0000 (16:18 +0100)]
x86/HVM: fold hypercall tables
In order to reduce the risk of unintentionally adding a function
pointer to just one of the two tables, merge them into one, with each
entry pair getting generated by a single macro invocation (at once
dropping all explicit casting outside the macro definition).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
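The idea of generating each entry pair from a single macro invocation can be sketched as follows. All names are hypothetical, and treating the two tables as "native" and "compat" variants is an assumption made for illustration, not something the message states:

```c
#include <stddef.h>

typedef long demo_hypercall_fn(unsigned long arg);

/* One entry carries both variants, so neither can be added alone. */
struct demo_hypercall_entry {
    demo_hypercall_fn *native;
    demo_hypercall_fn *compat;
};

/* Placeholder handlers that only return distinguishable values. */
static long demo_native_version(unsigned long a) { return (long)a + 64; }
static long demo_compat_version(unsigned long a) { return (long)a + 32; }
static long demo_native_yield(unsigned long a)   { (void)a; return 0; }
static long demo_compat_yield(unsigned long a)   { (void)a; return 0; }

enum { DEMO_NR_version = 0, DEMO_NR_yield = 1, DEMO_NR_max };

/* A single invocation fills in both halves of the pair. */
#define DEMO_HYPERCALL(name) \
    [DEMO_NR_ ## name] = { demo_native_ ## name, demo_compat_ ## name }

static const struct demo_hypercall_entry demo_table[DEMO_NR_max] = {
    DEMO_HYPERCALL(version),
    DEMO_HYPERCALL(yield),
};
```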
Jan Beulich [Wed, 17 Feb 2016 15:18:08 +0000 (16:18 +0100)]
x86/VMX: sanitize rIP before re-entering guest
... to prevent guest user mode arranging for a guest crash (due to
failed VM entry). (On the AMD system I checked, hardware is doing
exactly the canonicalization being added here.)
Note that fixing this in an architecturally correct way would be quite
a bit more involved: Making the x86 instruction emulator check all
branch targets for validity, plus dealing with invalid rIP resulting
from update_guest_eip() or incoming directly during a VM exit. The only
way to get the latter right would be by not having hardware do the
injection.
Note further that there are two early returns from
vmx_vmexit_handler(): One (through vmx_failed_vmentry()) leads to
domain_crash() anyway, and the other covers real mode only and can
neither occur with a non-canonical rIP nor result in an altered rIP,
so we don't need to force those paths through the checking logic.
This is CVE-2016-2271 / XSA-170.
Reported-by: 刘令 <liuling-it@360.cn>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
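Canonicalisation of an instruction pointer, as mentioned in the XSA-170 fix, means replicating the top implemented address bit into all higher bits. The helpers below are a hypothetical sketch assuming 48-bit virtual addresses; they are not the code added to the VMX exit path:

```c
#include <stdint.h>

/* Copy bit 47 into bits 63..48. */
static uint64_t demo_canonicalise(uint64_t rip)
{
    if (rip & (UINT64_C(1) << 47))
        return rip | UINT64_C(0xffff000000000000);
    return rip & UINT64_C(0x0000ffffffffffff);
}

/* An address is canonical if canonicalising it changes nothing. */
static int demo_is_canonical(uint64_t rip)
{
    return demo_canonicalise(rip) == rip;
}
```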
Jan Beulich [Wed, 17 Feb 2016 15:16:53 +0000 (16:16 +0100)]
x86: enforce consistent cachability of MMIO mappings
We've been told by Intel that inconsistent cachability between
multiple mappings of the same page can affect system stability only
when the affected page is an MMIO one. Since the stale data issue is
of no relevance to the hypervisor (since all guest memory accesses go
through proper accessors and validation), handling of RAM pages
remains unchanged here. Any MMIO mapped by domains however needs to be
done consistently (all cachable mappings or all uncachable ones), in
order to avoid Machine Check exceptions. Since converting existing
cachable mappings to uncachable (at the time an uncachable mapping
gets established) would in the PV case require tracking all mappings,
allow MMIO to only get mapped uncachable (UC, UC-, or WC).
This also implies that in the PV case we mustn't use the L1 PTE update
fast path when cachability flags get altered.
Since in the HVM case at least for now we want to continue honoring
pinned cachability attributes for pages not mapped by the hypervisor,
special case handling of r/o MMIO pages (forcing UC) gets added there.
Arguably the counterpart change to p2m-pt.c may not be necessary, since
UC- (which already gets enforced there) is probably strict enough.
Note that the shadow code changes include fixing the write protection
of r/o MMIO ranges: shadow_l1e_remove_flags() and its siblings, other
than l1e_remove_flags() and alike, return the new PTE (and hence
ignoring their return values makes them no-ops).
This is CVE-2016-2270 / XSA-154.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
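The pitfall behind the shadow code fix in the XSA-154 patch is a helper that returns the modified PTE rather than updating its argument. The sketch below uses hypothetical types and names (demo_pte_t, DEMO_PTE_RW, demo_pte_remove_flags), not the shadow code's own:

```c
#include <stdint.h>

typedef struct { uint64_t bits; } demo_pte_t;

#define DEMO_PTE_RW UINT64_C(0x2)   /* illustrative "writable" flag */

/*
 * Value-returning style: pte is passed by value, so the caller's copy is
 * untouched and the result must be assigned back. Calling this and
 * ignoring the return value has no effect at all.
 */
static demo_pte_t demo_pte_remove_flags(demo_pte_t pte, uint64_t flags)
{
    pte.bits &= ~flags;
    return pte;
}
```

A caller therefore has to write `pte = demo_pte_remove_flags(pte, DEMO_PTE_RW);` for the write protection to take effect.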